zero-one law
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Zero-One Laws of Graph Neural Networks
Graph neural networks (GNNs) are the de facto standard deep learning architectures for machine learning on graphs. This has led to a large body of work analyzing the capabilities and limitations of these models, particularly pertaining to their representation and extrapolation capacity. We offer a novel theoretical perspective on the representation and extrapolation capacity of GNNs, by answering the question: how do GNNs behave as the number of graph nodes become very large? Under mild assumptions, we show that when we draw graphs of increasing size from the Erdős-Rényi model, the probability that such graphs are mapped to a particular output by a class of GNN classifiers tends to either zero or one. This class includes the popular graph convolutional network architecture. The result establishes `zero-one laws' for these GNNs, and analogously to other convergence laws, entails theoretical limitations on their capacity. We empirically verify our results, observing that the theoretical asymptotic limits are evident already on relatively small graphs.
Towards a Zero-One Law for Column Subset Selection
There are a number of approximation algorithms for NP-hard versions of low rank approximation, such as finding a rank-$k$ matrix $B$ minimizing the sum of absolute values of differences to a given $n$-by-$n$ matrix $A$, $\min_{\textrm{rank-}k~B}\|A-B\|_1$, or more generally finding a rank-$k$ matrix $B$ which minimizes the sum of $p$-th powers of absolute values of differences, $\min_{\textrm{rank-}k~B}\|A-B\|_p^p$. Many of these algorithms are linear time columns subset selection algorithms, returning a subset of $\poly(k \log n)$ columns whose cost is no more than a $\poly(k)$ factor larger than the cost of the best rank-$k$ matrix.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Reviews: Towards a Zero-One Law for Column Subset Selection
As a caveat, I am not an expert in the literature surrounding low-rank reconstruction, and may not be entirely correct in my evaluation of the originality and significance of the contributions. Originality:This paper builds upon previous work, in particular [62], which developed column-subset selection for low-rank approximation under the l_p norm. This paper expands upon [62], obtaining results for a broader class of functions and furthermore tightening and fixing some results from [62]. These expansions seem very valuable to the machine learning community. However, the authors may want to further motivate their work by providing specific examples of loss functions to which they extend previous theory, and which have found successful applications in machine learning.
Zero-One Laws of Graph Neural Networks
Graph neural networks (GNNs) are the de facto standard deep learning architectures for machine learning on graphs. This has led to a large body of work analyzing the capabilities and limitations of these models, particularly pertaining to their representation and extrapolation capacity. We offer a novel theoretical perspective on the representation and extrapolation capacity of GNNs, by answering the question: how do GNNs behave as the number of graph nodes become very large? Under mild assumptions, we show that when we draw graphs of increasing size from the Erdős–Rényi model, the probability that such graphs are mapped to a particular output by a class of GNN classifiers tends to either zero or one. This class includes the popular graph convolutional network architecture.
Towards a Zero-One Law for Column Subset Selection
There are a number of approximation algorithms for NP-hard versions of low rank approximation, such as finding a rank- k matrix B minimizing the sum of absolute values of differences to a given n -by- n matrix A, \min_{\textrm{rank-}k B}\ A-B\ _1, or more generally finding a rank- k matrix B which minimizes the sum of p -th powers of absolute values of differences, \min_{\textrm{rank-}k B}\ A-B\ _p p . Many of these algorithms are linear time columns subset selection algorithms, returning a subset of \poly(k \log n) columns whose cost is no more than a \poly(k) factor larger than the cost of the best rank- k matrix. The above error measures are special cases of the following general entrywise low rank approximation problem: given an arbitrary function g:\mathbb{R} \rightarrow \mathbb{R}_{\geq 0}, find a rank- k matrix B which minimizes \ A-B\ _g \sum_{i,j}g(A_{i,j}-B_{i,j}) . A natural question is which functions g admit efficient approximation algorithms? Indeed, this is a central question of recent work studying generalized low rank models.
Zero-One Laws of Graph Neural Networks
Adam-Day, Sam, Iliant, Theodor Mihai, Ceylan, İsmail İlkan
Graph neural networks (GNNs) are the de facto standard deep learning architectures for machine learning on graphs. This has led to a large body of work analyzing the capabilities and limitations of these models, particularly pertaining to their representation and extrapolation capacity. We offer a novel theoretical perspective on the representation and extrapolation capacity of GNNs, by answering the question: how do GNNs behave as the number of graph nodes become very large? Under mild assumptions, we show that when we draw graphs of increasing size from the Erd\H{o}s-R\'enyi model, the probability that such graphs are mapped to a particular output by a class of GNN classifiers tends to either zero or to one. This class includes the popular graph convolutional network architecture. The result establishes 'zero-one laws' for these GNNs, and analogously to other convergence laws, entails theoretical limitations on their capacity. We empirically verify our results, observing that the theoretical asymptotic limits are evident already on relatively small graphs.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.28)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Towards a Zero-One Law for Column Subset Selection
Song, Zhao, Woodruff, David, Zhong, Peilin
There are a number of approximation algorithms for NP-hard versions of low rank approximation, such as finding a rank-$k$ matrix $B$ minimizing the sum of absolute values of differences to a given $n$-by-$n$ matrix $A$, $\min_{\textrm{rank-}k B}\ A-B\ _1$, or more generally finding a rank-$k$ matrix $B$ which minimizes the sum of $p$-th powers of absolute values of differences, $\min_{\textrm{rank-}k B}\ A-B\ _p p$. Many of these algorithms are linear time columns subset selection algorithms, returning a subset of $\poly(k \log n)$ columns whose cost is no more than a $\poly(k)$ factor larger than the cost of the best rank-$k$ matrix. A natural question is which functions $g$ admit efficient approximation algorithms? Indeed, this is a central question of recent work studying generalized low rank models. In this work we give approximation algorithms for {\it every} function $g$ which is approximately monotone and satisfies an approximate triangle inequality, and we show both of these conditions are necessary.